rbcL and Legume Phylogeny, with Particular Reference to Phaseoleae, Millettieae, and Allies
Abstract
A parsimony analysis was conducted on 319 rbcL sequences, comprising 242 from 194 genera of Leguminosae and 77 from other families. Results support earlier conclusions from rbcL and other molecular data that a monophyletic Leguminosae is part of a Fabales that includes Polygalaceae, Surianaceae, and the anomalous rosid genus Quillaja. Within legumes, results of previous analyses were also supported, such as the paraphyletic nature of Caesalpinioideae and monophyly of Mimosoideae and Papilionoideae. Most new data (74 sequences) were from Papilionoideae, particularly Phaseoleae, Millettieae, and allies. Although the overall topology for Papilionoideae was largely unresolved, several large clades were well-supported. The analysis contained a large sample of Phaseoleae and Millettieae, and not surprisingly showed both tribes to be polyphyletic, though with all taxa except Wisteria and allied Millettieae belonging to a single well supported clade. Within this clade was a strongly supported group that included Phaseoleae subtribes Erythrininae, Glycininae, Phaseolinae, Kennediinae, and Cajaninae, with only the last two being monophyletic. Desmodieae and Psoraleeae were also part of this clade. The monophyletic Phaseoleae subtribes Ophrestiinae and Diocleinae grouped with most Millettieae in a clade that included a group similar to the core Millettieae identified in other studies. All but one of the remaining Millettieae sampled formed an additional clade within the overall millettioid/phaseoloid group.

Of the various genes used for plant molecular systematic analyses at higher taxonomic levels, rbcL has been by far the most widely used, particularly for comprehensive analyses of angiosperms, whether alone (e.g., Chase et al. 1993; Kallersjo et al. 1998) or with other genes (e.g., Qiu et al. 1999; Soltis et al. 1999). Although several limitations of rbcL for angiosperm phylogeny reconstruction have been known since the earliest studies (e.g., Chase et al. 1993), the gene continues to be used in part because comparable sampling of a readily alignable sequence does not exist elsewhere. The availability of thousands of rbcL sequences in public databases (over 8,000 as of late 2000), representing all major groups of plants, allows the affinities of taxa whose phylogenetic relationships are unknown to be hypothesized simply and quickly.

The rbcL gene has played a role in the evolving understanding of legume phylogeny. The earliest comprehensive cladistic analyses of legume phylogeny with broad sampling were those of Chappill (1995), using a wide array of characters, and rbcL studies by two groups (Doyle 1995; Kass and Wink 1995). Both groups subsequently expanded these studies (Kass and Wink 1996, 1997a, 1997b; Doyle et al. 1997). Results from these studies were largely concordant with earlier molecular work, confirming for example the monophyly of groups with structural mutations of the chloroplast genome (e.g., Lavin et al. 1990; Doyle et al. 1996), and with long-standing views concerning the monophyly (or lack thereof) of the three subfamilies. Major groups in rbcL topologies were in many cases unresolved or weakly supported, particularly near the base of the tree in the paraphyletic Caesalpinioideae (Doyle et al. 1997). However, several large clades were identified within Papilionoideae, some of which were previously unknown, and several of which were well-supported.

516 SYSTEMATIC BOTANY [Volume 26

Phylogenetic analyses of the combined sequences from the two 1997 studies have not been published, and numerous new legume rbcL sequences have been generated since then. Moreover, none of the legume rbcL phylogenies included many outgroups. The sister group relationships of legumes are controversial, with molecular results in conflict with traditional views.
The availability of a large number of rbcL sequences both from legumes and from putatively related taxa makes it possible to study the effect of extensive legume sampling on outgroup relationships and of outgroup sampling on topologies within legumes. With recent improvements in computer hardware and software, as well as in search strategies, it is now possible to perform more thorough parsimony searches of tree space for large data sets (e.g., Nixon 1999). The goal of this paper is to conduct such a parsimony analysis on the large number of available legume rbcL sequences and numerous outgroups.

Taxon Sampling. The sample of approximately 250 Leguminosae sequences publicly available at the commencement of this project was biased toward some groups, particularly the papilionoid tribe Genisteae, which had been the focus of studies by the Wink laboratory (e.g., Kass and Wink 1997a, b). There was some overlap in genera, and in some cases even species, sampled between our group (Doyle et al. 1997) and the Wink group (Kass and Wink 1997b). Initial parsimony analyses were conducted in order to develop a data set that minimized redundancy and excessive sampling of genera such as Lupinus L. (Kass and Wink 1997a). Relatively few of our many Desmodieae sequences were included here, because relationships in this tribe will be discussed elsewhere. No more than two sequences were retained for any genus whose sequences were monophyletic in such analyses; for genera whose sequences did not form monophyletic groups (e.g., Sophora L.), all sequences were used. For species having multiple representative sequences but which did not belong to multiply-sampled genera, all sequences were used unless they were identical. The resulting data set of 242 legume sequences represented 194 genera (Table 1) and included 74 new sequences.
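The redundancy rule described above (at most two sequences per genus that was monophyletic in the initial analyses, all sequences for non-monophyletic genera) can be sketched as follows. This is only an illustration: the function name `thin_sequences`, the tuple layout, and the choice to keep the first two sequences in input order are assumptions, since the text does not say which two sequences were retained. Dropping exact duplicates within a species is applied to every genus here, which simplifies the rule stated in the text.

```python
from collections import defaultdict


def thin_sequences(records, monophyletic_genera, max_per_genus=2):
    """Reduce redundant sampling in a list of (genus, species, sequence) tuples.

    monophyletic_genera: genera whose sequences formed a monophyletic group
    in a preliminary analysis (supplied by the caller; not computed here).
    """
    by_genus = defaultdict(list)
    for record in records:
        by_genus[record[0]].append(record)

    kept = []
    for genus, genus_records in by_genus.items():
        # Drop exact duplicates: same species with an identical sequence.
        seen = set()
        unique = []
        for record in genus_records:
            key = (record[1], record[2])
            if key not in seen:
                seen.add(key)
                unique.append(record)
        if genus in monophyletic_genera:
            # Hypothetical choice: keep the first two in input order.
            kept.extend(unique[:max_per_genus])
        else:
            # Non-monophyletic genus: every sequence is retained.
            kept.extend(unique)
    return kept


records = [
    ("Lupinus", "albus", "ACGT"),
    ("Lupinus", "luteus", "ACGA"),
    ("Lupinus", "polyphyllus", "ACGC"),
    ("Sophora", "sp1", "AAAA"),
    ("Sophora", "sp2", "CCCC"),
    ("Sophora", "sp3", "GGGG"),
]
kept = thin_sequences(records, monophyletic_genera={"Lupinus"})
```

In this toy input, Lupinus is treated as monophyletic and is cut to two sequences, while all three Sophora sequences are kept.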
Emphasis was on Papilionoideae, with sequences from 164 of the 451 genera and all 30 of the tribes recognized by Polhill (1994), whose classification is used throughout this section. For Caesalpinioideae, 24 of 151 genera were included, representing all four tribes (Caesalpinieae, Cassieae, Cercideae, Detarieae). This included five of the nine informal "groups" of Caesalpinieae, four of the five subtribes of Cassieae, and both subtribes of Cercideae. However, only five genera were included from the large (81 genera) Detarieae, representing four of the 10 informal "groups." Outside of Detarieae, sampling deficiencies were due mostly to the difficulty in obtaining usable material. For example, numerous attempts to obtain sequences from collections of Duparquetia Baill. (Cassieae: Duparquetiinae) and Poeppigia Presl (Caesalpinieae: Poeppigia group) were unsuccessful. Sampling was lowest for Mimosoideae, with only six genera represented. However, this subfamily has been assumed to be monophyletic.

Seventy-seven sequences from families other than Leguminosae were also included (Table 1). These were chosen to represent: 1) families shown by previous comprehensive rbcL analyses (e.g., Chase et al. 1993; Soltis et al. 1995; Kallersjo et al. 1998) to belong to clades near legumes; 2) families hypothesized to be near legumes on the basis of morphology, chemistry, and other non-molecular data (Dickison 1981; Thorne 1992); and 3) families identified as close to legumes by the molecular, non-molecular, or combined analyses of Nandi et al. (1998). Asarum (Aristolochiaceae) was included as the outgroup to this assembly of largely "rosid" taxa. One new sequence was added, from Byrsocarpus coccinea, as a check on the position of Connaraceae, a key family from which only a single sequence (from Connarus conchocarpus) was publicly available.

Phylogenetic Analysis. The first 1,434 bases of the rbcL gene were aligned in Winclada (Nixon 1999b); the first 30 positions, corresponding to the forward amplification primer, were not used in analyses. Approximately 2% of the data for the 530 parsimony-informative sites were missing from the data set, primarily at the extreme 3' or 5' ends of sequences. Among legume taxa, only partial sequences were available for Dialium (335 of 530 informative positions), Hymenaea protera (122/530), Hymenolobium excelsum (235/530), Fordia cauliflora (269/530), and Strongylodon macrobotrys (316/530). The data matrix is available at TreeBASE (http://www.herbaria.harvard.edu/treebase/) as study accession number S578.

2001] KAJITA ET AL.: rbcL AND LEGUME PHYLOGENY 517

Parsimony analyses were conducted using NONA (Goloboff 1994), with nucleotide characters treated as unordered and equally weighted. Searches were conducted using the "parsimony ratchet" strategy, which has been shown to be very effective with data sets in excess of 500 terminals (Nixon 1999a), sampling tree space more efficiently than conventional methods (e.g., many iterations of random taxon additions optimizing all characters using equal weights). A typical ratchet analysis begins with a conventional starting tree from randomly ordered taxa (a single random addition sequence) and then initiates an iterative analysis consisting of the following steps: 1) perturbation of the matrix by increasing the weights of, or eliminating, a random small subset of characters; 2) branch swapping to obtain one representative shortest tree; 3) resetting weights to original values; 4) branch swapping with equal weights using the perturbed tree as the starting tree. The cycle is repeated by starting with the tree that resulted from the previous iteration and perturbing the data again, returning to step 1. A large number of iterations are conducted in a single ratchet analysis, with all equally parsimonious trees being retained.
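The four ratchet steps listed above can be illustrated with a small, self-contained sketch. It is not the NONA/Winclada implementation used in the study: the helper names, the nested-tuple tree representation, the leaf prune-and-regraft swapper, the doubling of weights, and the `frac=0.15` default are all assumptions made for illustration, and node constraints are omitted.

```python
import random


def fitch_length(tree, matrix, weights):
    """Weighted Fitch length of a rooted binary tree (nested tuples of taxon indices)."""
    total = 0

    def post(node):
        nonlocal total
        if isinstance(node, int):
            return [{state} for state in matrix[node]]
        left, right = post(node[0]), post(node[1])
        sets = []
        for i, (a, b) in enumerate(zip(left, right)):
            if a & b:
                sets.append(a & b)
            else:
                sets.append(a | b)
                total += weights[i]  # one state change needed at this node
        return sets

    post(tree)
    return total


def remove_leaf(tree, leaf):
    """Return the tree with one leaf pruned."""
    if isinstance(tree, int):
        return tree
    left, right = tree
    if left == leaf:
        return right
    if right == leaf:
        return left
    return (remove_leaf(left, leaf), remove_leaf(right, leaf))


def insertions(tree, leaf):
    """Yield every tree obtained by attaching leaf to some branch of tree."""
    yield (tree, leaf)
    if not isinstance(tree, int):
        left, right = tree
        for sub in insertions(left, leaf):
            yield (sub, right)
        for sub in insertions(right, leaf):
            yield (left, sub)


def hill_climb(tree, matrix, weights):
    """Branch swapping (leaf prune and regraft) until no shorter tree is found."""
    best = fitch_length(tree, matrix, weights)
    improved = True
    while improved:
        improved = False
        for leaf in range(len(matrix)):
            for cand in insertions(remove_leaf(tree, leaf), leaf):
                length = fitch_length(cand, matrix, weights)
                if length < best:
                    tree, best, improved = cand, length, True
                    break
            if improved:
                break
    return tree, best


def ratchet(matrix, iterations=20, frac=0.15, seed=0):
    """Toy parsimony ratchet; returns (shortest length, set of shortest trees found)."""
    rng = random.Random(seed)
    n_chars = len(matrix[0])
    equal = [1] * n_chars
    # Starting tree from a single random addition sequence.
    order = list(range(len(matrix)))
    rng.shuffle(order)
    tree = (order[0], order[1])
    for leaf in order[2:]:
        tree = min(insertions(tree, leaf), key=lambda t: fitch_length(t, matrix, equal))
    tree, best = hill_climb(tree, matrix, equal)
    shortest = {tree}  # rooted nested tuples; one unrooted tree may appear more than once
    for _ in range(iterations):
        # Step 1: perturb by upweighting a random subset of characters.
        perturbed = equal[:]
        for i in rng.sample(range(n_chars), max(1, round(frac * n_chars))):
            perturbed[i] = 2
        # Step 2: swap under perturbed weights, keeping one tree.
        tree, _unused = hill_climb(tree, matrix, perturbed)
        # Steps 3 and 4: reset weights, then swap from the perturbed tree.
        tree, length = hill_climb(tree, matrix, equal)
        if length < best:
            best, shortest = length, {tree}
        elif length == best:
            shortest.add(tree)
    return best, shortest


toy = ["AAA", "AAA", "CAA", "CCA", "CCC"]  # 5 taxa, 3 mutually compatible characters
best_length, best_trees = ratchet(toy, iterations=5, seed=1)
```

Each iteration starts from the tree produced by the previous one, as in the description above, so the search moves through tree space rather than restarting from a new random addition sequence.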
The efficiency of this method is attributed to the fact that shortest trees found with perturbed characters are not most parsimonious solutions, but are close enough that they serve as excellent starting trees for unperturbed analyses. The use of such trees is a major improvement over conventional random addition trees, which are far from parsimonious and require considerable searching to achieve near-optimality (Nixon 1999a). Because both the starting tree and the weighting scheme change with each iteration, the search also jumps quickly between tree islands.

Ratchets were implemented as described by Nixon (1999a) using Winclada (Nixon 1999b) to run NONA (Goloboff 1994). Following the guidelines presented by Nixon (1999a), the matrix was analyzed by perturbing 10-20% of the informative characters (weighting step). Individual ratchet runs used the following parameters: 200 iterations, 50-90 characters sampled, 10% of nodes constrained, holding one tree per iteration, and default "ambigpoly=" (no swapping on ambiguously supported nodes). Constraining a subset of nodes during the character-weighted tree search greatly increases the speed of the ratchet (Nixon 1999a). Nodes were unconstrained during the equally weighted search. Considerations of memory, topologies obtained, and support values for individual clades led to a decision to run sufficient ratchet analyses (435 in the case of the final complete analysis) to accumulate a total of at least 5,000 unique equally parsimonious trees, from which a strict consensus tree was then constructed.

Branch support values for the strict consensus tree were estimated using one hundred strict consensus bootstrap (Davis et al. 1998) replicates in NONA (Goloboff 1994) spawned in Winclada (Nixon 1999b). For each bootstrap tree, ten random addition sequences using TBR (tree bisection and reconnection) and holding ten trees per replication were conducted (100 replications of mult*10; h/10; no max*).
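The two ingredients of the bootstrap support estimate described above are resampling characters with replacement and counting how often each clade of a reference tree recurs among the replicate trees. The sketch below shows both under simplifying assumptions: trees are rooted nested tuples of taxon labels, each replicate contributes a single tree (the study used the strict consensus of the trees found in each replicate), and the tree search itself is left out. The function names are hypothetical.

```python
import random


def resample_characters(matrix, rng):
    """Bootstrap pseudoreplicate: draw columns with replacement, same matrix width."""
    n_chars = len(matrix[0])
    columns = [rng.randrange(n_chars) for _ in range(n_chars)]
    return ["".join(row[c] for c in columns) for row in matrix]


def clades(tree):
    """Set of clades (frozensets of leaf labels) in a rooted nested-tuple tree.

    Nodes may have more than two children, so consensus trees with
    polytomies are handled as well.
    """
    found = set()

    def walk(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def clade_support(reference_tree, replicate_trees):
    """Percentage of replicate trees containing each clade of the reference tree."""
    counts = {clade: 0 for clade in clades(reference_tree)}
    for tree in replicate_trees:
        present = clades(tree)
        for clade in counts:
            if clade in present:
                counts[clade] += 1
    return {clade: 100.0 * n / len(replicate_trees) for clade, n in counts.items()}


reference = (("A", "B"), ("C", "D"))
replicates = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    ((("A", "B"), "C"), "D"),
    (("A", "B"), ("C", "D")),
]
support = clade_support(reference, replicates)
pseudo = resample_characters(["ACGT", "ACGA", "TCGA"], random.Random(0))
```

In the toy example, the clade {A, B} appears in three of four replicate trees and {C, D} in two of four. A jackknife estimate would differ only in the resampling step, deleting a random subset of characters instead of drawing with replacement.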
The bootstrap values were plotted onto the ratchet strict consensus tree in Winclada and indicate the percentage of the bootstrap trees that contained each consensus clade. Jackknife clade support (Farris et al. 1996) was also estimated using Winclada to spawn jackknife replicates in NONA. One hundred replicates were conducted using 10 random addition sequences (mult*10), holding 10 shortest trees for each replication (hold/10).

The complete rbcL data set included 1,404 aligned bases with 530 potentially informative characters among the 319 sequences analyzed. A total of 5,700 equally most parsimonious trees were accumulated in 435 ratchets; each tree had a length of 5,997 steps (excluding uninformative characters), an ensemble consistency index of 0.16, and an ensemble retention index of 0.67. For most systematic purposes there is no need to identify all equally parsimonious trees (even when it is possible to do so), because if tree space is searched thoroughly and many tree islands are sampled, no changes in the strict consensus topology will occur as more trees are included from individual islands (Farris et al. 1996; Goloboff 1999; Nixon 1999a). The ratchet strategy is designed to identify many more islands than would be found in a comparable time using a conventional strategy (Nixon 1999a); thus the strict consensus from the 5,700 trees obtained during our searches would be unlikely to collapse further had more trees been saved. We initially tested these contentions in analyses

TABLE 1. Taxa sampled. Voucher information (collection and herbarium abbreviation) and GenBank accession numbers are given for rbcL sequences reported here for the first time. For samples for which rbcL sequences were reported elsewhere, only the GenBank number is provided here. Non-legumes (outgroups) are listed in alphabetical order, by genus, with family given in parentheses following the accession number.
Legumes are listed by subfamily and tribe, following Polhill (1994) with the exception of Hovea, classified as Brongniartieae following Crisp and Weston (1987).

NON-LEGUMES: Acer saccharum Marsh. L01881 (Sapindaceae); Aesculus pavia Castigl. U39277 (Sapindaceae); Ailanthus altissima Swingle L12566 (Simaroubaceae); Alnus incana (L.) Moench X56618 (Betulaceae); Aporusa frutescens Blume Z75674 (Euphorbiaceae); Asarum canadense L. L14290 (Aristolochiaceae); Balanops vieillardi Baill. AF089760 (Balanopaceae); Bauera rubioides N. Andr. L11174 (Saxifragaceae); Begonia metallica x sanguinea Maddi L01888 (Begoniaceae); Brassica oleracea L. M88342 (Brassicaceae); Byrsocarpus coccinea Benth. AF308704 (Connaraceae: Herendeen 9-XI-97-7, US); Casuarina litorea Stickm. L01893 (Casuarinaceae); Ceanothus sanguineus Pursh U06795 (Rhamnaceae); Celtis sinensis var. japonica (Planchon) Nakai D86309 (Ulmaceae); Celtis yunnanensis C.K. Schneid. L12638 (Ulmaceae); Chrysobalanus icaco L. L11178 (Chrysobalanaceae); Citrus paradisi Macfad. AJ238407 (Rutaceae); Clarkia xantiana A. Gray L01896 (Onagraceae); Cleome hassleriana Chodat M95755 (Capparaceae); Comesperma ericinum DC. L29492 (Polygalaceae); Connarus conchocarpus F. Muell. L29493 (Connaraceae); Coriaria myrtifolia L. L01897 (Coriariaceae); Corynocarpus laevigatus J.R. Forst. & G. Forst. X69731 (Corynocarpaceae); Crossosoma californicum Nutt. L11179 (Crossosomataceae); Cucumis sativus L. L21937 (Cucurbitaceae); Datisca cannabina L. L21939 (Datiscaceae); Drypetes roxburghii (Wall.) Hurus. M95757 (Euphorbiaceae); Elaeagnus angustifolia L. U17038 (Elaeagnaceae); Elaeocarpus grandis F. Muell. L28951 (Elaeocarpaceae); Eucryphia lucida (Labill.) Baill. L01918 (Cunoniaceae); Euonymus alatus (Thunb.) Siebold L13184 (Celastraceae); Euphorbia polychroma A. Kern L13185 (Euphorbiaceae); Fagus grandifolia Ehrh. L13338 (Fagaceae); Gironniera subaequalis Planch. D86311 (Ulmaceae); Gossypium robinsonii F. Muell. L13186 (Malvaceae); Guaiacum sanctum L. AJ131770 (Zygophyllaceae); Guilfoylia monostylis F. Muell. L29494 (Surianaceae); Heteropyxis natalensis Harv. U26326 (Heteropyxidaceae); Humulus lupulus L. AF061992 (Cannabaceae); Hymenanthera alpina (T. Kirk) W.R.B. Oliv. Z75692 (Violaceae); Juglans nigra L. U00437 (Juglandaceae); Koelreuteria paniculata Laxm. U39283 (Sapindaceae); Krameria lanceolata Torr. Y15032 (Krameriaceae); Leitneria floridana Chapm. AF062003 (Simaroubaceae); Licania tomentosa (Benth.) Fritsch L11193 (Chrysobalanaceae); Maclura pomifera (Raf.) C.K. Schneid. D86318 (Moraceae); Magnolia tripetala (L.) L. AJ131927 (Magnoliaceae); Myrica cerifera L. L01934 (Myricaceae); Opilia Roxb. sp. AJ131773 (Opiliaceae); Oxalis dillenii Jacq. L01938 (Oxalidaceae); Photinia x fraseri Dress L11200 (Rosaceae); Pilea pumila (L.) A. Gray U00438 (Urticaceae); Platytheca verticillata Baill. L01944 (Tremandraceae); Polygala cruciata L. L01945 (Polygalaceae); Prunus domestica L. L01947 (Rosaceae); Punica granatum L. L10223 (Punicaceae); Qualea Aubl. sp. U02730 (Vochysiaceae); Quillaja saponaria Molina QSU06822 (Rosaceae); Rhamnus cathartica L. L13189 (Rhamnaceae); Rhoiptelea chiliantha Diels & Hand.-Mazz. AF017687 (Rhoipteleaceae); Rinorea crenata S.F. Blake AJ237591 (Violaceae); Santalum album L. L26077 (Santalaceae); Saxifraga mertensiana Bong. U06216 (Saxifragaceae); Schinus molle L. U39270 (Anacardiaceae); Securidaca diversifolia (L.) S.F. Blake L01955 (Polygalaceae); Shepherdia canadensis (L.) Nutt. U17039 (Elaeagnaceae); Simarouba glauca DC. U38927 (Simaroubaceae); Spiraea X vanhouttei Zabel L11206 (Rosaceae); Sterculia tragacantha Lindl. AF022126 (Sterculiaceae); Stylobasium spathulatum Desf. U06828 (Surianaceae); Suriana maritima L. U07680 (Surianaceae); Swietenia macrophylla King U39080 (Meliaceae); Toxicodendron radicans (L.) Kuntze U39271 (Anacardiaceae); Trema micrantha (L.) Blume U03844 (Ulmaceae); Viola sororia Willd. L11674 (Violaceae); Viscum album L. L26078 (Viscaceae); Zygophyllum simplex L. Y15031 (Zygophyllaceae).

LEGUMINOSAE

CAESALPINIOIDEAE: Caesalpinieae: Acrocarpus Wight & Arn. sp. AF308699 (Manos 1416, DUKE); Caesalpinia pulcherrima (L.) Sw. U74190; Caesalpinia pulcherrima (L.) Sw. Z70153; Delonix regia (Bojer ex Hook.) Raf. Z70156; Erythrophleum ivorense A. Chev. U74205; Gleditsia triacanthos L. Z70129; Gymnocladus dioica (L.) K. Koch U74193; Parkinsonia aculeata L. Z70157; Peltophorum peltatum U74183; Peltophorum (Vogel) Benth. sp. U74184; Tachigali paniculata Aubl. U74240. Cassieae: Apuleia leiocarpa (Vogel) J.F. Macbr. U74249; Cassia fistula L. U74195; Cassia senna L. Z70155; Ceratonia siliqua L. U74203; Chamaecrista fasciculata (Michx.) Greene U74187; Dialium L. sp. U74259; Petalostylis labicheoides R. Br. AF308719 (Clemens s.n., BH); Senna alata (L.) Roxb. U74250; Senna didymobotrya Fresen. Z70154 (deposited as Cassia didymobotrya); Zenia insignis Chun AF308722 (Pacific Tropical Garden 82~19, HI). Cercideae: Bauhinia candicans Benth. Z70161; Bauhinia purpurea DC. ex Welp. Z70162; Cercis canadensis L. U74188; Cercis siliquastrum L. Z70164. Detarieae: Brownea Jacq. sp. U74186; Hymenaea protera G. O. Poinar L08477; Macrolobium acaciifolium (Benth.) Benth. U74191; Peltogyne confertiflora Benth. AF308718 (Bridgewater 793, RBGE); Tamarindus indica L. Z70160.

MIMOSOIDEAE: Acacieae: Acacia farnesiana (L.) Willd. Z70146. Ingeae: Albizia julibrissin Durazz. Z70147; Albizia saman (Jacq.) F. Muell. Z70149; Paraserianthes lophantha (Willd.) I.C. Nielsen Z70148; Pithecellobium mexicanum L. Z70150. Mimoseae: Mimosa speggazzinii Pirotta Z70151. Parkieae: Parkia roxburghii G. Don U74209.

PAPILIONOIDEAE: Abreae: Abrus precatorius L. U74224. Adesmieae: Adesmia exilis Clos U74254. Aeschynomeneae: Aeschynomene americana L. AB045784 (H.Ohash et al. f.n. 12, TUS); Aeschynomene indica L. AF308701 (Carulli 58, CHRB); Arachis hypogaea L. U74247; Zornia cantoniensis Mohlenbr. U74235. Amorpheae: Amorpha fruticosa L. U74212.
Bossiaeeae: